274 research outputs found

    A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion

    Full text link
    A finite-state method, based on leftmost longest-match replacement, is presented for segmenting words into graphemes, and for converting graphemes into phonemes. A small set of hand-crafted conversion rules for Dutch achieves a phoneme accuracy of over 93%. The accuracy of the system is further improved by using transformation-based learning. The phoneme accuracy of the best system (using a large set of rule templates and a `lazy' variant of Brill's algoritm), trained on only 40K words, reaches 99% accuracy.Comment: 8 page

    Constraint-Based Categorial Grammar

    Full text link
    We propose a generalization of Categorial Grammar in which lexical categories are defined by means of recursive constraints. In particular, the introduction of relational constraints allows one to capture the effects of (recursive) lexical rules in a computationally attractive manner. We illustrate the linguistic merits of the new approach by showing how it accounts for the syntax of Dutch cross-serial dependencies and the position and scope of adjuncts in such constructions. Delayed evaluation is used to process grammars containing recursive constraints.Comment: 8 pages, LaTe

    Probing for Dutch Relative Pronoun Choice

    Get PDF

    Probing for Dutch Relative Pronoun Choice

    Get PDF
    We propose a linguistically motivated version of the relative pronoun probing task for Dutch (where a model has to predict whether a masked token is either die or dat), collect realistic data for it using a parsed corpus, and probe the performance of four context-sensitive bert-based neural language models. Whereas the original task, which simply masked all occurrences of the words die and dat, was relatively easy, the linguistically motivated task turns out to be much harder. Models differ considerably in their performance, but a monolingual model trained on a heterogeneous corpus appears to be most robust

    Using a Treebank for Finding Opposites

    Get PDF
    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 139-150. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

    Mining Discourse Treebanks with XQuery

    Get PDF
    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 245-256. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891
    corecore